Prog for Finance - Data scraping S&P 500 P6

by: AhmadR, 7 years ago


Hi all/Harrison,

I'm in the middle of the Prog for Finance series but am having two problems extracting data as shown in the training video:

1) The pandas_datareader.data command is not extracting Yahoo ticker history. I looked at the pandas_datareader docs on remote data access for a solution, but the code on that page gives the same error message I'm getting. I've worked around this by using "google" finance instead, which works fine but doesn't give me the adjusted close price.

2) The "google" API worked fine until it came to extracting the S&P 500 data into CSV files by running "get_data_from_google()". The data for ticker "LMT" is not available on Google Finance (verified), so the download stops at "LMT" with a "Remote Data Error".

My questions are:

1) Is there another way of getting Yahoo ticker info, since the adjusted close is the best price to use?
2) How can I update the code in "get_data_from_yahoo()" so that it skips any tickers that have problems or are missing data, instead of stopping?

The code is:

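import datetime as dt
import os
import pickle

import pandas_datareader.data as web

# save_sp500_tickers() comes from the earlier part of the tutorial
def get_data_from_google(reload_sp500=False):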
    if reload_sp500:
        tickers = save_sp500_tickers()
    else:
        with open("sp500tickers.pickle", "rb") as f:
            tickers = pickle.load(f)
            
    if not os.path.exists('stock_dfs'):
        os.makedirs('stock_dfs')
        
    start = dt.datetime(2000,1,1)
    end = dt.datetime(2017,6,30)
    
    for ticker in tickers:
        print(ticker)
        if not os.path.exists('stock_dfs/{}.csv'.format(ticker)):
            df = web.DataReader(ticker, 'google', start, end)
            df.to_csv('stock_dfs/{}.csv'.format(ticker))
        else:
            print('Already have {}'.format(ticker))
            
get_data_from_google()







For #1:
Edit: this may also be easier: https://stackoverflow.com/questions/44045158/python-pandas-datareader-no-longer-works-for-yahoo-finance-changed-url

$ git clone https://github.com/rgkimball/pandas-datareader
$ cd pandas-datareader
$ git checkout fix-yahoo
$ pip install -e .

If you don't have or use git, you can also just download the files and install from the latest PR, or maybe just try updating pandas-datareader; the fix might already be released, I haven't confirmed.

For #2, you could put a try/except inside the "for ticker in tickers" loop, so a ticker that fails gets skipped instead of stopping the whole download.
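A rough sketch of what that try/except could look like. The names download_all, fetch and out_dir are made up for illustration and are not from the tutorial; fetch stands in for whatever call gets the data, for example lambda t: web.DataReader(t, 'google', start, end). It catches a broad Exception just to keep the sketch short; a narrower exception type would be better if you know which one your data source raises.

```python
import os


def download_all(tickers, fetch, out_dir="stock_dfs"):
    """Save one CSV per ticker, skipping tickers whose download fails.

    fetch is a hypothetical callable: it takes a ticker string and returns
    an object with a to_csv(path) method (e.g. a pandas DataFrame).
    Returns the list of tickers that could not be downloaded.
    """
    os.makedirs(out_dir, exist_ok=True)
    failed = []
    for ticker in tickers:
        path = os.path.join(out_dir, "{}.csv".format(ticker))
        if os.path.exists(path):
            print("Already have {}".format(ticker))
            continue
        try:
            df = fetch(ticker)
        except Exception as exc:  # broad on purpose; narrow it if you can
            print("Skipping {}: {}".format(ticker, exc))
            failed.append(ticker)
            continue
        df.to_csv(path)
    return failed
```

The returned list tells you which tickers were skipped, so you can retry them later or check them by hand.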

-Harrison 7 years ago
Last edited 7 years ago



Thanks Harrison. I ended up getting past this issue with a manual, non-coding workaround: I created a dummy file for the ticker that was missing from Google Finance, so the code saw it as already downloaded and moved on to the next ticker. Not ideal, but it worked!

Must go back now and fix the code properly... Thanks!

-AhmadR 7 years ago
Last edited 7 years ago
